## Deep dive: Imbalanced targets {: #deep-dive-imbalanced-targets }

In AML and Transaction Monitoring, the SAR rate is usually very low (1%–5%, depending on the detection scenarios); sometimes it could be even lower than 1% in extremely unproductive scenarios. In machine learning, such a problem is called _class imbalance_. The question becomes, how can you mitigate the risk of class imbalance and let the machine learn as much as possible from the limited known-suspicious activities?

DataRobot offers different techniques to handle class imbalance problems. Some techniques:

* Evaluate the model with <a target="_blank" rel="noopener noreferrer" href="https://docs.datarobot.com/en/docs/modeling/reference/model-detail/opt-metric.html#optimization-metrics"><b>different metrics</b></a>. For binary classification (the false positive reduction model here, for example), LogLoss is used as the default metric to rank models on the Leaderboard. Since the rule-based system is often unproductive, which leads to a very low SAR rate, it’s reasonable to take a look at a different metric, such as the SAR rate in the top 5% of alerts in the prioritization list. The objective of the model is to assign a higher prioritization score with a high risk alert, so it’s ideal to have a higher rate of SAR in the top tier of the prioritization score. In the example shown in the image below, the SAR rate in the top 5% of prioritization score is more than 70% (the original SAR rate is less than 10%), which indicates that the model is very effective in ranking the alert based on the SAR risk.

* DataRobot also provides flexibility for modelers when tuning hyperparameters which could also help with the class imbalance problem. In the example below, the Random Forest Classifier is tuned by enabling the balance_boostrap (a random sample with an equal amount of SAR and non-SAR alerts in each decision tree in the forest); you can see the validation score of the new ‘Balanced Random Forest Classifier’ model is slightly better than the parent model.

![](images/aml-change-metric.png)

* You can also use <a target="_blank" rel="noopener noreferrer" href="https://docs.datarobot.com/en/docs/modeling/build-models/adv-opt/smart-ds.html#smart-downsampling"><b>Smart Downsampling</b></a> (from the Advanced Options tab) to intentionally downsample the majority class (i.e., non-SAR alerts) in order to build faster models with similar accuracy.
